dotpot/FastLinks

## FastLinks

The missing simple links parser for Python & humans

Use this component when you want to extract HTTP links from content in a fast (very fast) way.

### Overview

Imagine you have this HTML content:

    <LINK REL="SHORTCUT ICON" HREF="favicon.ico" />

    src='/clickme.php?id=10&amp;stats=23d'

    URL="http://www.testsite.com/verygood.html"

    href='www.testsite.com/hello placentas/word.htm'
    href='../test.html'

And all you want is a list of normal-looking links from it.

### You can do it now!!

Just call:

    links = get_links(content, 'http://www.testsite.com/')

Isn't that trolololowesome?!

Output:

    [1] http://www.testsite.com/test.html
    [2] http://www.testsite.com/hello placentas/word.htm
    [3] http://www.testsite.com/favicon.ico
    [4] http://www.testsite.com/verygood.html
    [5] http://www.testsite.com/clickme.php?id=10&stats=23d
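If you are curious how this kind of extraction can work, here is a rough sketch that uses only the Python standard library. It is not the FastLinks implementation: the name `sketch_get_links`, the regular expression, and the rule that treats a value starting with `www.` as an absolute host are all assumptions made for illustration. It returns links in order of appearance, so the order may differ from the output above.

```python
import re
from html import unescape
from urllib.parse import urljoin

# Hypothetical pattern (not taken from FastLinks): an href=, src= or url=
# attribute followed by a single- or double-quoted value.
_ATTR_RE = re.compile(
    r"""\b(?:href|src|url)\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def sketch_get_links(content, base_url):
    """Return unique absolute links found in content, in order of appearance."""
    links = []
    seen = set()
    for match in _ATTR_RE.finditer(content):
        # Decode HTML entities, e.g. "&amp;" becomes "&".
        raw = unescape(match.group(2).strip())
        if not raw:
            continue
        # Assumption: a scheme-less value starting with "www." names a host.
        if raw.lower().startswith("www."):
            raw = "http://" + raw
        # Resolve relative paths such as "../test.html" against the base URL.
        absolute = urljoin(base_url, raw)
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


sample = """
<LINK REL="SHORTCUT ICON" HREF="favicon.ico" />
src='/clickme.php?id=10&amp;stats=23d'
URL="http://www.testsite.com/verygood.html"
href='www.testsite.com/hello placentas/word.htm'
href='../test.html'
"""

for number, link in enumerate(sketch_get_links(sample, "http://www.testsite.com/"), 1):
    print("[%d] %s" % (number, link))
```

A regular expression keeps the sketch short and tolerant of broken markup, at the cost of missing unquoted attribute values.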

Please feel free to improve it if you like :)


You can also try CustomStringParser, which offers more power for data mining.
